An Ensemble approach on Missing Value Handling in Hepatitis Disease Dataset
نویسندگان
چکیده
The Major work in data pre-processing is handling Missing value imputation in Hepatitis Disease Diagnosis which is one of the primary stage in data mining. Many health datasets are typically imperfect. Just removing the cases from the original datasets can fetch added problems than elucidations. A appropriate technique for missing value imputation can assist to generate high-quality datasets for enhanced scrutinizing in clinical trials. This paper investigates the exploit of a machine learning technique as a missing value imputation process for incomplete Hepatitis data. Mean/mode imputation, ID3 algorithm imputation, decision tree imputation and proposed bootstrap aggregation based imputation are used as missing value imputation and the resultant datasets are classified using KNN. The experiment reveals that classifier performance is enhanced when the Bagging based imputation algorithm is used to foresee missing attribute values.
منابع مشابه
Investigating the missing data effect on credit scoring rule based models: The case of an Iranian bank
Credit risk management is a process in which banks estimate probability of default (PD) for each loan applicant. Data sets of previous loan applicants are built by gathering their data, and these internal data sets are usually completed using external credit bureau’s data and finally used for estimating PD in banks. There is also a continuous interest for bank to use rule based classifiers to b...
متن کاملClassifier Ensemble Framework: a Diversity Based Approach
Pattern recognition systems are widely used in a host of different fields. Due to some reasons such as lack of knowledge about a method based on which the best classifier is detected for any arbitrary problem, and thanks to significant improvement in accuracy, researchers turn to ensemble methods in almost every task of pattern recognition. Classification as a major task in pattern recognition,...
متن کاملDEA with Missing Data: An Interval Data Assignment Approach
In the classical data envelopment analysis (DEA) models, inputs and outputs are assumed as known variables, and these models cannot deal with unknown amounts of variables directly. In recent years, there are few researches on handling missing data. This paper suggests a new interval based approach to apply missing data, which is the modified version of Kousmanen (2009) approach. First, the prop...
متن کاملKnowledge Mining from Clinical Datasets Using Rough Sets and Backpropagation Neural Network
The availability of clinical datasets and knowledge mining methodologies encourages the researchers to pursue research in extracting knowledge from clinical datasets. Different data mining techniques have been used for mining rules, and mathematical models have been developed to assist the clinician in decision making. The objective of this research is to build a classifier that will predict th...
متن کاملA New Approach for Handling Null Values in Web Log Using KNN and Tabu Search KNN
When the data mining procedures deals with the extraction of interesting knowledge from web logs is known as Web usage mining. The result of any mining is successful, only if the dataset under consideration is well preprocessed. One of the important preprocessing steps is handling of null/missing values. Handlings of null values have been a great bit of test for researcher. Various methods are ...
متن کامل